BUG: Fix unpickling of string dtypes of legacy pandas versions #61770

Open
wants to merge 6 commits into
base: main
from

Conversation

Liam3851
Contributor

@Liam3851 Liam3851 commented Jul 3, 2025

@Liam3851 Liam3851 marked this pull request as ready for review July 3, 2025 23:09
@Liam3851 Liam3851 changed the title from "Fix unpickling of string dtypes of legacy pandas versions" to "BUG: Fix unpickling of string dtypes of legacy pandas versions" Jul 4, 2025
@jorisvandenbossche jorisvandenbossche added the Bug, Strings (String extension data type and string data), and IO Pickle (read_pickle, to_pickle) labels Jul 4, 2025
@jorisvandenbossche jorisvandenbossche added this to the 2.3.1 milestone Jul 4, 2025
Member

@jorisvandenbossche jorisvandenbossche left a comment

@Liam3851 thanks a lot for the bug report and the fix!

Looks perfect, and thanks for adding legacy data (we should probably also add some data for 2.0-2.2 ..).

Can you add a note in the doc/source/whatsnew/v2.3.1.rst file? We will want to backport this fix.

@@ -218,6 +220,10 @@ def __eq__(self, other: object) -> bool:
            return self.storage == other.storage and self.na_value is other.na_value
        return False

    def __setstate__(self, state: MutableMapping[str, Any]) -> None:
        self.storage = state.pop("storage", "python")
Member

Suggested change
-        self.storage = state.pop("storage", "python")
+        # back-compat for pandas < 2.3, where na_value did not yet exist
+        self.storage = state.pop("storage", "python")
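
For readers unfamiliar with the pattern in this hunk, here is a minimal sketch of a back-compatible `__setstate__`. It is not the pandas implementation: `ToyStringDtype`, the `_na_value` attribute name, and the `"NA"` default are hypothetical stand-ins chosen for illustration. The idea is that `pickle` passes the stored state to `__setstate__` on load, so popping each key with a default lets state written by an older version (which lacks newer keys) still restore cleanly.

```python
from typing import Any, MutableMapping


class ToyStringDtype:
    """Hypothetical stand-in for a dtype whose pickled state gained a field."""

    def __init__(self, storage: str = "python", na_value: Any = "NA") -> None:
        self.storage = storage
        self._na_value = na_value

    def __setstate__(self, state: MutableMapping[str, Any]) -> None:
        # Older pickles may lack keys that newer code expects,
        # so every key is read with a fallback default.
        self.storage = state.pop("storage", "python")
        self._na_value = state.pop("_na_value", "NA")  # assumed default


# Simulate state written by an older version: no "_na_value" key.
legacy_state = {"storage": "pyarrow"}

# Mimic what pickle does on load: create the instance without calling
# __init__, then hand the stored state to __setstate__.
restored = ToyStringDtype.__new__(ToyStringDtype)
restored.__setstate__(legacy_state)

print(restored.storage, restored._na_value)
```

Without the defaults, restoring the legacy state would either raise a `KeyError` or leave the newer attribute unset, which only surfaces later when the attribute is first read.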

Labels
Bug, IO Pickle (read_pickle, to_pickle), Strings (String extension data type and string data)
Development

Successfully merging this pull request may close these issues.

BUG: StringDtype objects from pandas <2.3.0 cannot be reliably unpickled in 2.3.0.
2 participants